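Compared with LTE networks, the vision of 5G is to provide high data rates, low latency (to enable near-real-time applications), significantly increased base station capacity, and near-perfect quality of service (QoS) for users. To provide such services, 5G systems will support various combinations of access technologies such as LTE, NR, NR-U, and Wi-Fi. Each radio access technology (RAT) provides a different type of access, and these should be allocated and managed optimally among users. Besides resource management, 5G systems will also support a dual-connectivity service. Network orchestration therefore becomes a more difficult problem for system managers than it was with legacy access technologies. In this paper, we propose a RAT allocation algorithm based on federated meta-learning (FML), which enables RAN Intelligent Controllers (RICs) to adapt more quickly to dynamically changing environments. We designed a simulation environment that contains LTE and 5G NR service technologies. In the simulation, our objective is to fulfil UE demands within the transmission deadline in order to provide higher QoS values. We compared the proposed algorithm with a single RL agent, the Reptile algorithm, and a rule-based heuristic method. Simulation results show that the proposed FML method achieves higher caching rates in the first deployment round, by 21% and 12% respectively. Moreover, among the compared methods, the proposed approach adapts most quickly to new tasks and environments.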
translated by 谷歌翻译
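Next-generation networks will actively embrace artificial intelligence (AI) and machine learning (ML) technologies for network automation and optimal network operation strategies. The emerging network architecture represented by Open RAN (O-RAN) follows this trend, and the RAN intelligent controller (RIC) at the centre of its specification serves as a host for ML applications. Various ML models, especially reinforcement learning (RL) models, are regarded as the key to solving RAN-related multi-objective optimization problems. However, it should be recognized that most current RL successes are confined to abstract and simplified simulation environments, which may not translate directly into high performance in complex real environments. One of the main reasons is the modelling gap between the simulation and the real environment, which can leave an RL agent trained in simulation ill-equipped for the real environment. This issue is termed the sim2real gap. This article brings the sim2real challenge to the fore in the context of O-RAN. Specifically, it emphasizes the characteristics and benefits of digital twins (DT) as a place for model development and verification. Several use cases are presented to exemplify and demonstrate the failure modes of simulation-trained RL models in real environments. The effectiveness of DT in assisting the development of RL algorithms is discussed. The state-of-the-art learning-based methods commonly used to overcome the sim2real challenge are then presented. Finally, the development and deployment concerns for realising RL applications in O-RAN are discussed from the viewpoint of potential issues such as data interaction, environment bottlenecks, and algorithm design.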
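Radio access network (RAN) technologies continue to witness massive growth, with Open RAN gaining the most recent momentum. In the O-RAN specifications, the RAN intelligent controller (RIC) serves as an automation host. This article introduces principles of machine learning (ML), and in particular reinforcement learning (RL), relevant to the O-RAN stack. Furthermore, we review state-of-the-art research in wireless networks and map it onto the RAN framework and the hierarchy of the O-RAN architecture. We provide a taxonomy of the challenges faced by ML/RL models throughout the development life cycle, from system specification to production deployment (data acquisition, model design, testing and management, etc.). To address these challenges, we integrate a set of existing MLOps principles with the unique characteristics that arise when RL agents are considered. This paper discusses a systematic life-cycle pipeline for model development, testing, and validation, termed RLOps. We discuss all the fundamental parts of RLOps, including model specification, development and distillation, production environment serving, operations monitoring, safety/security, and the data engineering platform. Based on these principles, we propose best practices for RLOps to achieve an automated and reproducible model development process.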
While the capabilities of autonomous systems have been steadily improving in recent years, these systems still struggle to rapidly explore previously unknown environments without the aid of GPS-assisted navigation. The DARPA Subterranean (SubT) Challenge aimed to fast-track the development of autonomous exploration systems by evaluating their performance in real-world underground search-and-rescue scenarios. Subterranean environments present a plethora of challenges for robotic systems, such as limited communications, complex topology, visually degraded sensing, and harsh terrain. The presented solution enables long-term autonomy with minimal human supervision by combining a powerful and independent single-agent autonomy stack with higher-level mission management operating over a flexible mesh network. The autonomy suite deployed on quadruped and wheeled robots was fully independent, freeing the human supervisor to loosely supervise the mission and make high-impact strategic decisions. We also discuss lessons learned from fielding our system at the SubT Final Event, relating to vehicle versatility, system adaptability, and re-configurable communications.
This paper presents our solutions for the MediaEval 2022 task on DisasterMM. The task is composed of two subtasks, namely (i) Relevance Classification of Twitter Posts (RCTP), and (ii) Location Extraction from Twitter Texts (LETT). The RCTP subtask aims at differentiating flood-related and non-relevant social posts, while LETT is a Named Entity Recognition (NER) task that aims at extracting location information from the text. For RCTP, we proposed four different solutions based on BERT, RoBERTa, DistilBERT, and ALBERT, obtaining F1-scores of 0.7934, 0.7970, 0.7613, and 0.7924, respectively. For LETT, we used three models, namely BERT, RoBERTa, and DistilBERT, obtaining F1-scores of 0.6256, 0.6744, and 0.6723, respectively.
In recent years, social media has been widely explored as a potential source of communication and information in disasters and emergency situations. Several interesting works and case studies of disaster analytics exploring different aspects of natural disasters have already been conducted. Along with its great potential, disaster analytics comes with several challenges, mainly due to the nature of social media content. In this paper, we explore one such challenge and propose a text classification framework to deal with noisy Twitter data. More specifically, we employed several transformers, both individually and in combination, to differentiate between relevant and non-relevant Twitter posts, achieving a highest F1-score of 0.87.
Osteoarthritis (OA) is the most prevalent chronic joint disease worldwide, and knee OA accounts for more than 80% of commonly affected joints. Knee OA is not yet curable, and it affects large numbers of patients, making it costly to patients and healthcare systems. The etiology, diagnosis, and treatment of knee OA remain open to debate because of variability in its clinical and physical manifestations. Although knee OA carries a list of well-known terminology aiming to standardize the nomenclature of the diagnosis, prognosis, treatment, and clinical outcomes of the chronic joint disease, in practice there is a wide range of terminology associated with knee OA across different data sources, including but not limited to biomedical literature, clinical notes, healthcare literacy, and health-related social media. Among these data sources, the scientific articles published in the biomedical literature usually follow a principled pipeline to study the disease. Rapid yet accurate text mining on large-scale scientific literature may discover novel knowledge and terminology to better understand knee OA and to improve the quality of knee OA diagnosis, prevention, and treatment. The present work aims to utilize artificial neural network strategies to automatically extract vocabularies associated with knee OA. Our findings indicate the feasibility of developing word embedding neural networks for autonomous keyword extraction and abstraction of knee OA.
Multilingual language models (MLMs) acquire valuable, generalizable linguistic information during pretraining and have advanced the state of the art on task-specific finetuning. So far, only ~28 out of ~2,000 African languages are covered in existing language models. We ameliorate this limitation by developing SERENGETI, a set of massively multilingual language models that covers 517 African languages and language varieties. We evaluate our novel models on eight natural language understanding tasks across 20 datasets, comparing to four MLMs that each cover any number of African languages. SERENGETI outperforms other models on 11 datasets across the eight tasks and achieves an average F1 of 82.27. We also perform error analysis on our models' performance and show the influence of mutual intelligibility when the models are applied under zero-shot settings. We will publicly release our models for research.
Because pretrained language models play a crucial role across NLP, several benchmarks have been proposed to evaluate them. In spite of these efforts, no public benchmark of diverse nature currently exists for the evaluation of Arabic. This makes it challenging to measure progress for both Arabic and multilingual language models. The challenge is compounded by the fact that any benchmark targeting Arabic needs to take into account that Arabic is not a single language but rather a collection of languages and varieties. In this work, we introduce ORCA, a publicly available benchmark for Arabic language understanding evaluation. ORCA is carefully constructed to cover diverse Arabic varieties and a wide range of challenging Arabic understanding tasks, exploiting 60 different datasets across seven NLU task clusters. To measure current progress in Arabic NLU, we use ORCA to offer a comprehensive comparison between 18 multilingual and Arabic language models. We also provide a public leaderboard with a unified single-number evaluation metric (ORCA score) to facilitate future research.
Task-agnostic generative pretraining (GPT) has recently proved promising for zero- and few-shot learning, gradually diverting attention from the expensive supervised learning paradigm. Although the community is accumulating knowledge about the capabilities of English-language autoregressive models such as GPT-3 that adopt this generative approach, scholarship about these models remains acutely Anglocentric. Consequently, the community currently has serious gaps in its understanding of this class of models, their potential, and their societal impacts in diverse settings, linguistic traditions, and cultures. To alleviate this issue for Arabic, a collection of diverse languages and language varieties with a population of more than 400 million, we introduce JASMINE, a suite of powerful Arabic autoregressive Transformer language models ranging in size from 300 million to 13 billion parameters. We pretrain our new models with large amounts of diverse data (400GB of text) from different Arabic varieties and domains. We evaluate JASMINE extensively in both intrinsic and extrinsic settings, using a comprehensive benchmark for zero- and few-shot learning across a wide range of NLP tasks. We also carefully develop and release a novel benchmark for both automated and human evaluation of Arabic autoregressive models, focused on investigating potential social biases, harms, and toxicity in these models. We aim to responsibly release our models to interested researchers, along with code for experimenting with them.